Focus and Element Length for Book and Wikipedia Retrieval
Abstract
In this paper we describe our participation in INEX 2010 in the Ad Hoc Track and the Book Track. In the Ad Hoc Track we investigate the impact of propagated anchor text on article-level precision and the impact of an element length prior on within-document precision and recall. Using the article ranking of a document-level run for both document and focused retrieval techniques, we find that focused retrieval techniques clearly outperform document retrieval, especially for the Focused and Restricted Relevant in Context tasks, which limit the amount of text that can be returned per topic and per article, respectively. Somewhat surprisingly, an element length prior increases within-document precision even when we restrict the amount of retrieved text to only 1000 characters per topic. The query-independent evidence of the length prior can help locate elements with a large fraction of relevant text. For the Book Track we look at the relative impact of retrieval units based on whole books, individual pages, and multiple pages.
Similar resources
The Impact of Document Level Ranking on Focused Retrieval
Document retrieval techniques have proven to be competitive methods in the evaluation of focused retrieval. Although focused approaches such as XML element retrieval and passage retrieval allow for locating the relevant text within a document, using the larger context of the whole document often leads to superior document level ranking. In this paper we investigate the impact of using the docum...
OUC's Participation in the 2011 INEX Book Track
In this article we describe the Oslo University College’s participation in the INEX 2009 Book track. This year’s tasks have been featuring complex topics, containing aspects. These lend themselves to use in both the book retrieval and the focused retrieval tasks. The OUC has submitted retrieval results for both tasks, focusing on using the Wikipedia texts for query expansion, as well as utilizi...
University of Amsterdam at INEX 2010: Ad hoc and Book Tracks
In this paper we describe our participation in INEX 2010 in the Ad Hoc Track and the Book Track. In the Ad Hoc Track we investigate the impact of propagated anchor text on article-level precision and the impact of an element length prior on within-document precision and recall. Using the article ranking of a document-level run for both document and focused retrieval techniques, we find tha...
Focused Search in Books and Wikipedia: Categories, Links and Relevance Feedback
In this paper we describe our participation in INEX 2009 in the Ad Hoc Track, the Book Track, and the Entity Ranking Track. In the Ad Hoc track we investigate focused link evidence, using only links from retrieved sections. The new collection is not only annotated with Wikipedia categories, but also with YAGO/WordNet categories. We explore how we can use both types of category information, in t...
From Focused Elements to Snippets (Master of Science thesis by Supraja Nagalla, University of Minnesota)
Information Retrieval is a field of computing which traditionally deals with searching a large collection of documents and retrieving documents based on their similarity to the query. INEX [10] provides a platform (e.g., document collection, queries and uniform evaluation metrics) for the development and evaluation of retrieval algorithms for XML documents. The focus of INEX is to reduce the gr...